Automated Information Extraction using Amorphic

Author

  • Dawn G. Gregg
Abstract

The Amorphic system is an adaptive web information extraction scheme for building intelligent systems that mine information from web pages. It can locate data of interest based on domain knowledge or page structure, automatically generate a wrapper for an information source, and detect when the structure of a web-based resource has changed, then use this knowledge to search the updated resource for the desired information. This allows Amorphic to adapt to the changing structures of websites, so users can manage their information extraction more effectively. Five example implementations are described to illustrate the need for information extraction systems capable of extracting information from semi-structured web documents. They demonstrate the versatility of the system, showing how a system like Amorphic can be used in systematic data extraction applications that require data collection over an extended period of time. The current Amorphic system represents a cost-effective approach to developing large-scale, adaptable information extraction systems for a variety of domains.
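The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: generate a wrapper from a label-based (domain-knowledge) search, apply it, and fall back to searching the page again when the wrapper no longer matches. Every name here (`parse`, `generate_wrapper`, `apply_wrapper`, `extract`, the sample pages, and the tag-path wrapper format) is a hypothetical illustration, not Amorphic's actual design.

```python
from html.parser import HTMLParser


class PathTextParser(HTMLParser):
    """Collect (tag_path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.nodes = []   # (path, text) in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("/".join(self.stack), text))


def parse(html):
    parser = PathTextParser()
    parser.feed(html)
    parser.close()
    return parser.nodes


def locate_by_label(nodes, label):
    """Domain-knowledge search: index of the text node following the label."""
    for i, (_, text) in enumerate(nodes[:-1]):
        if text.rstrip(":").strip().lower() == label.lower():
            return i + 1
    return None


def generate_wrapper(nodes, label):
    """Assumed wrapper format: tag path plus position among same-path nodes."""
    i = locate_by_label(nodes, label)
    if i is None:
        return None
    path = nodes[i][0]
    ordinal = sum(1 for p, _ in nodes[:i] if p == path)
    return {"path": path, "ordinal": ordinal}


def apply_wrapper(nodes, wrapper):
    matches = [t for p, t in nodes if p == wrapper["path"]]
    k = wrapper["ordinal"]
    return matches[k] if k < len(matches) else None


def extract(html, wrapper, label):
    """Return (value, wrapper, regenerated)."""
    nodes = parse(html)
    value = apply_wrapper(nodes, wrapper) if wrapper else None
    if value is not None:
        return value, wrapper, False
    # Wrapper failed: treat it as a structure change and search again.
    wrapper = generate_wrapper(nodes, label)
    value = apply_wrapper(nodes, wrapper) if wrapper else None
    return value, wrapper, True


# Made-up sample pages: the same field before and after a redesign.
OLD_PAGE = "<html><body><table><tr><td>Price:</td><td>$10.00</td></tr></table></body></html>"
NEW_PAGE = "<html><body><div><span>Price:</span><b>$12.50</b></div></body></html>"

if __name__ == "__main__":
    wrapper = generate_wrapper(parse(OLD_PAGE), "Price")
    print(extract(OLD_PAGE, wrapper, "Price"))
    print(extract(NEW_PAGE, wrapper, "Price"))
```

This sketch only notices a change when the stored tag path disappears. A system meant to run over an extended period would presumably also need to validate the extracted value itself, since a redesigned page can keep the same path while moving the data.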

Similar resources

Laser Raman Studies of Polycrystalline and Amorphic Diamond Films

This report describes the results of a number of different, but related, laser Raman studies on various CVD diamond samples. It attempts to show the versatility of Raman spectroscopy as a diagnostic tool for the quality of diamond films, by demonstrating its use in a few novel applications. The studies of the laser Raman spectra of amorphic diamond and CVD diamond films are performed using lase...

On amorphic C-algebras

An amorphic association scheme has the property that any of its fusion is also an association scheme. In this paper we generalize the property to be amorphic to an arbitrary C-algebra and prove that any amorphic C-algebra is determined up to isomorphism by the multiset of its degrees and an additional integer equal ±1. Moreover, we show that any amorphic C-algebra with rational structure consta...

Automated Data Extraction from Online Social Network Profiles: Unique Ethical Challenges for Researchers

As the use of online social networking (OSN) sites is increasing, data extraction from OSN profiles is providing researchers with a rich source of data. Data extraction is divided into non-automated and automated approaches. However, researchers face a variety of ethical challenges especially using automated data extraction approaches. In social networking, there has been a lack of research tha...

DNA profiling from heroin street dose packages.

A large number of heroin street doses are seized and examined for drug content by the Israel police. These are generally wrapped in heat-sealed plastic. Occasionally it is possible to visualize latent fingerprints on the plastic wrap itself, but the small size of the plastic item and the sealing process make the success rate very low. In this study, the possibility of extracting and profiling ...

Publication date: 2007